Performance-optimized partitioning of clonotypes from high-throughput immunoglobulin repertoire sequencing data
Authors
Abstract
Motivation: During adaptive immune responses, activated B cells expand and undergo somatic hypermutation of their immunoglobulin (Ig) receptor, forming a clone of diversified cells that can be traced back to a common ancestor. Identifying B cell clonotypes from high-throughput Adaptive Immune Receptor Repertoire sequencing (AIRR-seq) data relies on computational analysis. Recently, we proposed an automated method to partition sequences into clonal groups based on single-linkage clustering of the Ig receptor junction region with a length-normalized Hamming distance metric. This method could identify clonally related sequences with high confidence on several benchmark experimental and simulated data sets. However, the approach was computationally expensive and unable to provide estimates of accuracy for new data. Here, a new method is presented that addresses this computational bottleneck and also provides a study-specific estimate of performance, including sensitivity and specificity. The method uses a finite mixture model fitting procedure to learn the parameters of two univariate curves that fit the bimodal distribution of the distances between pairs of sequences. These distributions are used to estimate the performance of different threshold choices for partitioning sequences into clonotypes. The performance estimates are validated using simulated and experimental data sets. With this method, clonotypes can be identified from AIRR-seq data with sensitivity and specificity profiles that are user-defined based on the overall goals of the study.
Availability: Source code is freely available at the Immcantation Portal: www.immcantation.com under the CC BY-SA 4.0 license.
Contact: [email protected]
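The two ideas in the abstract, threshold-based single-linkage clustering on a length-normalized Hamming distance and performance estimates read off a fitted two-component mixture, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names are hypothetical, the two components are assumed to be Gaussian (the abstract says only "two univariate curves"), and the mixture parameters are assumed to have been fitted already.

```python
import math
from itertools import combinations


def normalized_hamming(a: str, b: str) -> float:
    """Hamming distance divided by sequence length (equal lengths required)."""
    if len(a) != len(b):
        raise ValueError("junctions must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def single_linkage(junctions, threshold):
    """Return one cluster label per junction.

    Two junctions are linked when their normalized Hamming distance is at or
    below the threshold; clusters are the connected components of those links.
    Assumption of this sketch: junctions of different length are never linked.
    """
    parent = list(range(len(junctions)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(junctions)), 2):
        same_length = len(junctions[i]) == len(junctions[j])
        if same_length and normalized_hamming(junctions[i], junctions[j]) <= threshold:
            parent[find(i)] = find(j)
    return [find(i) for i in range(len(junctions))]


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def threshold_performance(t, mu_low, sd_low, mu_high, sd_high):
    """Estimated (sensitivity, specificity) of threshold t.

    Assumption: the low-distance component models clonally related pairs and
    the high-distance component models unrelated pairs, both Gaussian.
    """
    sensitivity = normal_cdf(t, mu_low, sd_low)          # related pairs below t
    specificity = 1.0 - normal_cdf(t, mu_high, sd_high)  # unrelated pairs above t
    return sensitivity, specificity
```

Scanning `threshold_performance` over candidate thresholds gives the kind of sensitivity and specificity trade-off from which a study-specific cutoff could be chosen and then passed to `single_linkage`.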
Similar resources
Immune Repertoire Profiling Reveals that Clonally Expanded B and T Cells Infiltrating Diseased Human Kidneys Can Also Be Tracked in Blood
Recent advances in high-throughput sequencing allow for the competitive analysis of the human B and T cell immune repertoire. In this study we compared Immunoglobulin and T cell receptor repertoires of lymphocytes found in kidney and blood samples of 10 patients with various renal diseases based on next-generation sequencing data. We used Biomed-2 primer panels and ImmunExplorer software to seq...
High frequency of herpesvirus-specific clonotypes in the human T cell repertoire can remain stable over decades with minimal turnover.
High-throughput T cell receptor sequencing on sequentially banked blood samples from healthy individuals has shown that high-frequency clonotypes can remain relatively stable for up to 18 years, with minimal inflation, deflation, or turnover. These populations included T cell expansions specific for Epstein-Barr virus. Thus, in spite of exposure to a barrage of microorganisms over the course of...
High-throughput sequencing of immunoglobulin genes: Life without a template
Immunoglobulin (that is, antibody) and T cell receptor genes are created through somatic gene rearrangement from gene segment libraries. Immunoglobulin genes are further diversified by somatic hypermutation and selection during the immune response. Studying the repertoires of these genes yields valuable insights into immune system function in infections, aging, autoimmune diseases and cancers. ...
Revealing Individual Signatures of Human T Cell CDR3 Sequence Repertoires with Kidera Factors
The recent development of High Throughput Sequencing technologies has enabled an individual's TCR repertoire to be efficiently analysed at the nucleotide level. However, with unique clonotypes ranging in the tens of millions per individual, this approach gives a surfeit of information that is difficult to analyse and interpret in a biological context and gives little information about TCR struc...
High-Throughput Sequencing of Islet-Infiltrating Memory CD4+ T Cells Reveals a Similar Pattern of TCR Vβ Usage in Prediabetic and Diabetic NOD Mice
Autoreactive memory CD4(+) T cells play a critical role in the development of type 1 diabetes, but it is not yet known how the clonotypic composition and TCRβ repertoire of the memory CD4(+) T cell compartment changes during the transition from prediabetes to diabetes. In this study, we used high-throughput sequencing to analyze the TCRβ repertoire of sorted islet-infiltrating memory CD4(+)CD44...